Tell Me Dave: Context-Sensitive Grounding of Natural Language to Manipulation Instructions
Authors
Abstract
We consider performing a sequence of mobile manipulation tasks with instructions given in natural language (NL). Given a new environment, even a simple task such as boiling water would be performed quite differently depending on the presence, location and state of the objects. We start by collecting a dataset of task descriptions in free-form natural language and the corresponding grounded task-logs of the tasks performed in an online robot simulator. We then build a library of verb-environment-instructions that represents the possible instructions for each verb in that environment; these may or may not be valid for a different environment and task context. We present a model that takes into account the variations in natural language and the ambiguities in grounding them to robotic instructions with appropriate environment context and task constraints. Our model also handles incomplete or noisy NL instructions. It is based on an energy function that encodes such properties in a form isomorphic to a conditional random field. In evaluation, we show that our model produces sequences that perform the task successfully in a simulator and also significantly outperforms the state-of-the-art. We also verify our approach by executing our output instruction sequences on a PR2 robot.
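As a rough illustration of what minimizing a CRF-like energy over instruction sequences can look like, the sketch below assumes a chain-structured energy with one unary term per NL clause and one pairwise term between consecutive instructions, minimized by dynamic programming. The abstract does not specify the structure, features, or inference procedure of the actual model, so every name here (`min_energy_sequence`, `toy_unary`, `toy_pairwise`, the instruction strings) is hypothetical and the energies are toy values.

```python
from typing import Callable, List, Sequence, Tuple


def min_energy_sequence(
    clauses: Sequence[str],
    candidates: Sequence[str],
    unary: Callable[[str, str], float],
    pairwise: Callable[[str, str], float],
) -> Tuple[float, List[str]]:
    """Choose one candidate instruction per clause so that the sum of
    unary and pairwise energies along the chain is minimal (Viterbi-style DP).
    Assumption: a simple chain energy, not the paper's actual model."""
    if not clauses or not candidates:
        return 0.0, []
    k = len(candidates)
    # cost[j]: lowest energy of any labeling of the clauses seen so far
    # whose last instruction is candidates[j].
    cost = [unary(clauses[0], c) for c in candidates]
    back: List[List[int]] = []
    for clause in clauses[1:]:
        new_cost, ptr = [], []
        for c in candidates:
            best_prev = min(
                range(k), key=lambda p: cost[p] + pairwise(candidates[p], c)
            )
            new_cost.append(
                cost[best_prev] + pairwise(candidates[best_prev], c) + unary(clause, c)
            )
            ptr.append(best_prev)
        cost = new_cost
        back.append(ptr)
    # Trace back from the lowest-energy final state.
    j = min(range(k), key=lambda idx: cost[idx])
    best = cost[j]
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return best, [candidates[j] for j in path]


def toy_unary(clause: str, instr: str) -> float:
    # Hypothetical unary term: zero energy if the instruction's verb appears in the clause.
    verb = instr.split("(")[0]
    return 0.0 if verb in clause else 1.0


def toy_pairwise(prev: str, nxt: str) -> float:
    # Hypothetical pairwise term: penalize repeating the same instruction.
    return 2.0 if prev == nxt else 0.0


clauses = ["grasp the pot", "fill it with water", "heat the pot"]
candidates = ["grasp(pot)", "fill(pot, tap)", "heat(pot, stove)"]
energy, plan = min_energy_sequence(clauses, candidates, toy_unary, toy_pairwise)
print(energy, plan)
```

In a model of the kind the abstract describes, the unary and pairwise terms would presumably be learned from the collected task-logs and would depend on the environment state; the hand-written toy functions above only stand in for that.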
Similar resources
Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera
Controlling robots by natural language (NL) is increasingly attracting attention for its versatility, convenience and no need of extensive training for users. Grounding is a crucial challenge of this problem to enable robots to understand NL instructions from humans. This paper mainly explores the object grounding problem and concretely studies how to detect target objects by the NL instruction...
Temporal Grounding Graphs for Language Understanding with Accrued Visual-Linguistic Context
A robot’s ability to understand or ground natural language instructions is fundamentally tied to its knowledge about the surrounding world. We present an approach to grounding natural language utterances in the context of factual information gathered through natural-language interactions and past visual observations. A probabilistic model estimates, from a natural language utterance, the object...
Grounding Abstract Spatial Concepts for Language Interaction with Robots
Our goal is to develop models that allow a robot to understand or “ground” natural language instructions in the context of its world model. Contemporary approaches estimate correspondences between an instruction and possible candidate groundings such as objects, regions and goals for a robot’s action. However, these approaches are unable to reason about abstract or hierarchical concepts such as...
Efficient Grounding of Abstract Spatial Concepts for Natural Language Interaction with Robot Manipulators
Our goal is to develop models that allow a robot to understand natural language instructions in the context of its world representation. Contemporary models learn possible correspondences between parsed instructions and candidate groundings that include objects, regions and motion constraints. However, these models cannot reason about abstract concepts expressed in an instruction like, “pick up...
Gated-Attention Architectures for Task-Oriented Language Grounding
To perform tasks specified by natural language instructions, autonomous agents need to extract semantically meaningful representations of language and map it to visual elements and actions in the environment. This problem is called task-oriented language grounding. We propose an end-to-end trainable neural architecture for task-oriented language grounding in 3D environments which assumes no prio...
Journal:
- I. J. Robotics Res.
Volume 35, issue
Pages -
Publication date 2014